feat(myopencre): improve CSV import validation (export-compatible) #682

PRAteek-singHWY · 2025-12-28T22:34:11Z

⚠️ This PR depends on:

feat(myopencre): add CSV upload UI and wire to existing import endpoint #664
Please review this PR after the above PR is merged.

Note for reviewers

The intended change in this PR is limited to
application/web/web_main.py (CSV import validation logic).

Any additional file diffs shown by GitHub are inherited from branch history / stacking and are not part of the validation change itself.

Please let me know if you’d prefer this PR to be rebased or split into a narrower diff.

Summary

This PR improves server-side validation for the MyOpenCRE CSV import endpoint, aligning it closely with the CSV format produced by the CRE catalogue export flow.

The intent is not to introduce strict or surprising validation rules, but to ensure that:

every CSV produced by the exporter can be safely imported
malformed or ambiguous input is rejected early with clear, structured errors
exporter artifacts (padding rows, empty rows) do not cause failures

What this PR does

File-level validation

Rejects missing, empty, non-CSV, or non–UTF-8 uploads
Returns consistent, structured error responses

Schema / header validation

Requires at least one CRE* column
Requires standard|name and standard|id
Rejects rows with more columns than the header (misaligned CSVs)

Row-level validation (export-compatible)

Completely empty rows are ignored (exported templates include padding rows)
Rows without any CRE values are ignored
CRE entries are validated only when present (<CRE-ID>|<Name>)
Malformed CRE entries return row-specific validation errors

No-op import guard

If a file contains no importable rows after filtering, the request returns success with no changes
This avoids confusing partial imports or unnecessary failures

What is intentionally ignored (by design)

Empty rows
Padding rows from exported templates
Rows without CRE mappings
Extra unused columns (as long as the CSV structure itself is valid)

This mirrors how production importers typically behave and allows exported files to be round-tripped without manual cleanup.

What is out of scope for this PR

Frontend error presentation / formatting
UX changes around errors

These will be handled in follow-up PRs to keep this change focused and reviewable.

Why this approach

The importer now accepts everything the exporter produces, ignores exporter padding rows, and enforces validation only where semantic meaning exists.

This keeps the import process resilient while still preventing invalid data from entering the system.

Dependency note

This PR is stacked on top of:

feat(myopencre): add CSV upload UI and wire to existing import endpoint (#664)

It should be reviewed and merged after that PR.

Feedback welcome

If any validation behavior is too permissive or too strict, or if there are exporter edge cases I may have missed, I’d really appreciate guidance and am happy to adjust.

PRAteek-singHWY · 2025-12-28T22:34:57Z

Hi! @northdpole Sir I spent a few hours validating this against the CSV produced by the CRE catalogue exporter and tried to keep the importer behavior as close to that format as possible.

If any of the validation rules here feel too permissive, too strict, or misaligned with intended usage, I’d really appreciate feedback and guidance.

northdpole · 2026-01-03T22:39:01Z

application/web/web_main.py

+
    if file is None:
-        abort(400, "No file provided")
+        return (


Excellent work! thanks for adding the validation.
One thing: this very helpful validation is a bit confusing and out of place in the "controllers" file.
Ideally this file only handles requests/responses. Why don't we move this validation to the spreadsheet parsers file in the appropriate function?

This way not only we get a bit clearer code but also we get a start on a spreadsheet validation utility we can eventually make into its own module.
For now just moving this validation logic into the appropriate function in the spreadsheet handlers module is enough 👍

Excellent work! thanks for adding the validation. One thing: this very helpful validation is a bit confusing and out of place in the "controllers" file. Ideally this file only handles requests/responses. Why don't we move this validation to the spreadsheet parsers file in the appropriate function?

This way not only we get a bit clearer code but also we get a start on a spreadsheet validation utility we can eventually make into its own module. For now just moving this validation logic into the appropriate function in the spreadsheet handlers module is enough 👍

Thank you @northdpole , this makes complete sense — agreed 👍

You’re right that this validation logic doesn’t really belong in the controller layer. I’ll move the CSV validation into the appropriate spreadsheet parser / handler function so the controller is limited to request/response handling.

This should also give us a cleaner starting point for a reusable spreadsheet validation utility, as you suggested. I’ll refactor accordingly and update the PR shortly. Thanks again for the guidance!

… Project

- Validate file type, encoding, and required headers - Accept CSVs generated from CRE catalogue export - Skip empty and padding rows present in exported templates - Validate CRE format only when CRE references exist - Guard against misaligned rows with extra columns - Return structured validation errors before import This keeps the importer aligned with the exporter while preventing malformed inputs from causing server errors.

PRAteek-singHWY · 2026-01-04T11:18:49Z

Thanks a lot @northdpole Sir for the thoughtful feedback — much appreciated 👍

I’ve addressed the concern about validation placement across this PR stack .

What changed based on your feedback
• CSV validation logic has been moved out of the controller layer and into the spreadsheet parsing layer.
• The parser entry point (parse_export_format) now acts as the single place where incoming CSVs are validated and parsed.
• Structural and row-level validation rules live in a dedicated helper (validate_import_csv_rows), which is invoked internally by the parser before any import logic proceeds.

This keeps the controller focused strictly on request/response handling, while centralizing CSV validation in a place that’s easier to reason about, test, and evolve (with a clear path toward a standalone validation utility if needed).

PR overview (for context)
• #682 – Backend CSV import validation aligned with export format
• #683 – Frontend handling and display of validation errors
• #684 – Clear UI messaging for no-op / empty imports
• #685 – CSV import preview and confirmation flow for safer imports
• #686 – Inline help and guidance for CSV preparation

Each PR builds incrementally on the previous one, but validation itself now consistently happens at the parser level, not in the controllers.

I’m definitely looking forward to improving the validation further — especially around clearer error semantics and extensibility as more CSV use-cases come in.

Thanks again for the guidance, it helped clean up the design significantly.
Happy to adjust naming or structure further if you’d prefer a different direction.

PRAteek-singHWY and others added 8 commits December 15, 2025 12:06

feat(myopencre): add initial MyOpenCRE page and CSV template download

944ca22

Merge branch 'main' into myopencre-csv-download

c3a747e

feat(myopencre): add initial MyOpenCRE page and CSV template download

54235ad

Merge branch 'main' into myopencre-csv-download

64c8b44

Resolve nav conflicts and standardize MyOpenCRE links

a5f56ef

chore: resolve merge conflicts with upstream main

a4f988f

chore: resolve header merge conflict

35d5a0f

fix(header): restore sr-only utility for accessibility

b285409

PRAteek-singHWY mentioned this pull request Dec 28, 2025

feat(myopencre): improve CSV import error handling in UI #683

Open

northdpole reviewed Jan 3, 2026

View reviewed changes

PRAteek-singHWY and others added 7 commits January 4, 2026 04:27

Merge branch 'main' into myopencre-csv-download

8c929ce

Merge branch 'main' into myopencre-csv-download

3ddf7af

chore(osib): update OWASP name to Open Worldwide Application Security…

a83b918

… Project

feat(myopencre): enable CSV download of all CREs

735eb7b

feat(myopencre): add CSV upload UI and wire to existing import endpoint

57118a0

fix: correct SCSS block after merge conflict

fcd3dbd

PRAteek-singHWY force-pushed the myopencre-csv-import-validation branch from 53b85b0 to 197653b Compare January 3, 2026 23:25

PRAteek-singHWY and others added 4 commits January 4, 2026 05:14

Merge branch 'main' into myopencre-csv-import-validation

2bba24d

refactor(csv-import): move CSV validation into spreadsheet parser

d3e1ab4

refactor(csv): move import validation to spreadsheet parser entry point

c81609f

style(csv): format spreadsheet parser with black

cb089b8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(myopencre): improve CSV import validation (export-compatible) #682

feat(myopencre): improve CSV import validation (export-compatible) #682

Uh oh!

PRAteek-singHWY commented Dec 28, 2025 •

edited

Loading

Uh oh!

PRAteek-singHWY commented Dec 28, 2025

Uh oh!

northdpole Jan 3, 2026

Uh oh!

PRAteek-singHWY Jan 3, 2026

Uh oh!

PRAteek-singHWY commented Jan 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat(myopencre): improve CSV import validation (export-compatible) #682

Are you sure you want to change the base?

feat(myopencre): improve CSV import validation (export-compatible) #682

Uh oh!

Conversation

PRAteek-singHWY commented Dec 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Note for reviewers

Summary

What this PR does

File-level validation

Schema / header validation

Row-level validation (export-compatible)

No-op import guard

What is intentionally ignored (by design)

What is out of scope for this PR

Why this approach

Dependency note

Feedback welcome

Uh oh!

PRAteek-singHWY commented Dec 28, 2025

Uh oh!

northdpole Jan 3, 2026

Choose a reason for hiding this comment

Uh oh!

PRAteek-singHWY Jan 3, 2026

Choose a reason for hiding this comment

Uh oh!

PRAteek-singHWY commented Jan 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

PRAteek-singHWY commented Dec 28, 2025 •

edited

Loading